Locality-Sensitive Hashing for Data with Categorical and Numerical Attributes Using Dual Hashing

Author

  • Keon-Myung Lee
Abstract

Locality-sensitive hashing techniques have been developed to efficiently handle nearest neighbor search and similar-pair identification for large volumes of high-dimensional data. This study proposes a locality-sensitive hashing method for nearest neighbor search on data sets that contain both numerical and categorical attributes. The proposed method uses dual hashing functions: one is dedicated to the numerical attributes and the other to the categorical attributes. The method builds an indexing structure for each of the two hashing functions, gathers and combines the candidate sets they return, and then examines the candidates exhaustively to determine the nearest ones. The method is evaluated on a few synthetic data sets, and the results show that it improves performance for large data sets with both numerical and categorical attributes.
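The pipeline described in the abstract (one index per hash family, candidate gathering, exact re-ranking) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify the hash families, how candidates are combined, or the distance measure. The sketch assumes random-projection hashing for the numerical attributes, attribute sampling for the categorical attributes, a union of the two candidate sets, and a distance equal to the Euclidean distance plus a weighted count of categorical mismatches. The class name `DualLSHIndex` and all parameter names are hypothetical.

```python
import math
import random
from collections import defaultdict


class DualLSHIndex:
    """Toy dual-hash index: one LSH family per attribute type (illustrative only)."""

    def __init__(self, num_dim, cat_dim, n_tables=8, n_proj=2, width=4.0,
                 n_sampled=2, seed=0):
        rng = random.Random(seed)
        self.width = width
        # Numerical side (assumed): random Gaussian projections with random offsets.
        self.projections = [
            [([rng.gauss(0.0, 1.0) for _ in range(num_dim)], rng.uniform(0.0, width))
             for _ in range(n_proj)]
            for _ in range(n_tables)
        ]
        # Categorical side (assumed): each table samples a few attribute positions.
        self.positions = [
            tuple(rng.sample(range(cat_dim), min(n_sampled, cat_dim)))
            for _ in range(n_tables)
        ]
        self.num_tables = [defaultdict(list) for _ in range(n_tables)]
        self.cat_tables = [defaultdict(list) for _ in range(n_tables)]
        self.items = []

    def _num_key(self, t, x):
        # Quantize each projection into buckets of the configured width.
        return tuple(
            math.floor((sum(a * v for a, v in zip(vec, x)) + b) / self.width)
            for vec, b in self.projections[t]
        )

    def _cat_key(self, t, c):
        # Points agreeing on all sampled categorical attributes share a bucket.
        return tuple(c[i] for i in self.positions[t])

    def add(self, x, c):
        idx = len(self.items)
        self.items.append((tuple(x), tuple(c)))
        for t in range(len(self.num_tables)):
            self.num_tables[t][self._num_key(t, x)].append(idx)
            self.cat_tables[t][self._cat_key(t, c)].append(idx)
        return idx

    def candidates(self, x, c):
        # Assumption: candidates from both indexes are combined by set union.
        found = set()
        for t in range(len(self.num_tables)):
            found.update(self.num_tables[t].get(self._num_key(t, x), ()))
            found.update(self.cat_tables[t].get(self._cat_key(t, c), ()))
        return found

    def query(self, x, c, k=1, cat_weight=1.0):
        # Exact re-ranking of the candidates with an assumed mixed distance.
        def dist(idx):
            px, pc = self.items[idx]
            euclid = math.sqrt(sum((a - b) ** 2 for a, b in zip(px, x)))
            mismatches = sum(1 for a, b in zip(pc, c) if a != b)
            return euclid + cat_weight * mismatches
        return sorted(self.candidates(x, c), key=dist)[:k]


index = DualLSHIndex(num_dim=2, cat_dim=3)
index.add([0.1, 0.2], ["red", "small", "round"])
index.add([5.0, 5.1], ["blue", "large", "square"])
index.add([0.15, 0.25], ["red", "small", "oval"])
nearest = index.query([0.1, 0.2], ["red", "small", "round"], k=1)
print(nearest)
```

A query identical to an indexed point always lands in that point's buckets, so the point is guaranteed to appear among the candidates. For other queries the result is approximate and depends on the number of tables, the bucket width, and how many categorical positions each table samples.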

Similar resources

Compressed Image Hashing using Minimum Magnitude CSLBP

Image hashing tolerates compression, enhancement, and other signal processing operations on digital images, which are usually acceptable manipulations, whereas cryptographic hash functions are very sensitive to even single-bit changes in an image. An image hash summarizes important quality features in quantized form. In this paper, we propose a novel image hashing algorithm for authentication which i...

Image authentication using LBP-based perceptual image hashing

Feature extraction is a main step in all perceptual image hashing schemes, in which robust features lead to better perceptual robustness. Simplicity, discriminative power, computational efficiency, and robustness to illumination changes are regarded as distinguishing properties of Local Binary Pattern features. In this paper, we investigate the use of local binary patterns for percep...

Multi-Level Spherical Locality Sensitive Hashing For Approximate Near Neighbors

This paper introduces “Multi-Level Spherical LSH”: a parameter-free, multi-level, data-dependent Locality Sensitive Hashing data structure for solving the Approximate Near Neighbors Problem (ANN). This data structure is a modified version of a multi-probe adaptive querying algorithm, with the potential of achieving an O(np + t) query run time, for all inputs n where t <= n. Keywords—Locality Sensitiv...

Locality-Sensitive Hashing with Margin Based Feature Selection

We propose a learning method with feature selection for Locality-Sensitive Hashing. Locality-Sensitive Hashing converts feature vectors into bit arrays. These bit arrays can be used to perform similarity searches and personal authentication. The proposed method starts from bit arrays longer than those ultimately used for similarity and other searches, and selects by learning the bits that will be used....

Approximate Computation of Object Distances by Locality-Sensitive Hashing

We propose an approximate computation technique for inter-object distances in binary data sets. Our approach is based on locality-sensitive hashing, scales with the number of objects, and is much faster than the “brute-force” computation of these distances.

Journal title:
  • Int. J. Fuzzy Logic and Intelligent Systems

Volume 14  Issue 

Pages  -

Publication date: 2014